Learning Representations for Log Data in Cybersecurity

Authors

  • Ignacio Arnaldo
  • Alfredo Cuesta-Infante
  • Ankit Arun
  • Mei Lam
  • Costas Bassias
  • Kalyan Veeramachaneni
Abstract

We introduce a framework for exploring and learning representations of log data generated by enterprise-grade security devices, with the goal of detecting advanced persistent threats (APTs) spanning several weeks. The framework uses a divide-and-conquer strategy that combines behavioral analytics, time series modeling, and representation learning algorithms to model large volumes of data. In addition, given that we have access to human-engineered features, we analyze how well a series of representation learning algorithms complement those features in a variety of classification approaches. We demonstrate the approach with a novel dataset extracted from 3 billion log lines generated at an enterprise network boundary with reported command and control communications. The results validate our approach, achieving an area under the ROC curve of 0.943 and 95 true positives among the top 100 ranked instances on the test data set.
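
The two metrics reported in the abstract (area under the ROC curve, and true positives among the top-ranked instances) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `roc_auc` and `true_positives_at_k` and the toy labels and scores are hypothetical, chosen only to show how such figures are typically computed from a ranked list of detector scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    Equals the fraction of (positive, negative) pairs in which the positive
    instance receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def true_positives_at_k(labels: Sequence[int], scores: Sequence[float], k: int) -> int:
    """Number of positive labels among the k highest-scored instances."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k])


# Hypothetical toy data: 1 = malicious, 0 = benign; higher score = more suspicious.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_tp_at_3 = true_positives_at_k(toy_labels, toy_scores, 3)
print(f"AUC = {toy_auc:.3f}, true positives in top 3 = {toy_tp_at_3}")
```

With the paper's figures, the second function would be called with `k=100` on the test-set scores; the pairwise AUC shown here is quadratic in the number of instances and is meant for illustration rather than for data at the scale described.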

Similar resources

Data Science Methodology for Cybersecurity Projects

Cybersecurity solutions are traditionally static and signature-based. Combined with analytic models, machine learning, and big data, these traditional solutions could be improved to automatically trigger mitigation or provide relevant awareness to control or limit the consequences of threats. This kind of intelligent solution is covered in the context of Data Science for Cybersecurity. Data Sc...

Proceedings of the Fiftieth Annual Meeting of the American Association for the Advancement of Science.

Notices of the AMS 73 of Western Ontario Modeling Honey Bee-Plant Symbiosis in the Presence of Environmental Toxins New features of the 2017 meeting are fifteen-minute "flash talks." Keith Devlin, Stanford University, will be presenting one of these on Saturday, February 18, 2017, 1:30 pm–1:45 pm; the title is "Symbols of Success: New Representations for Teaching and Doing Mathematics." In the...

Effectiveness of Cognitive Captain's Log Software on Visual-Spatial Perception of Student with Learning Disabilities

Purpose: The purpose of this study was to examine the effectiveness of the cognitive Captain's Log software on the visual-spatial perception of students with learning disabilities. Method: This research used a pretest-posttest design with a control group. The statistical population consisted of all students with learning disabilities who were referred to educational and rehabilitation centers of students with specific l...

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing, where there exist many low-level representations of images, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...

Instructional Perspective: Towards an Integrative Learning Approach in Cybersecurity

This paper describes a multifaceted approach to cybersecurity education based on integrative learning theory. We emphasize the need to focus on curriculum, experiential learning techniques, assessment and fostering a community of practice. The need to build conceptual, tactical and practical skills among cybersecurity professionals is highlighted. The paper will include examples of how integrat...

Publication date: 2017